36 research outputs found

    A QSAR classification model of skin sensitization potential based on improving binary crow search algorithm

    Classifying skin sensitization using the quantitative structure-activity relationship (QSAR) model is important. Applying descriptor selection is essential to improve the performance of the classification task. Recently, a binary crow search algorithm (BCSA) was proposed and has been successfully applied to variable selection. In this work, a new time-varying transfer function is proposed to improve the exploration and exploitation capability of the BCSA in selecting the most relevant descriptors for a QSAR classification model with high classification accuracy and short computing time. The results demonstrate that the proposed method is reliable and can reasonably separate the compounds into sensitizers and non-sensitizers with high classification accuracy.
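
    The abstract does not give the exact transfer function used; as an illustration only, a generic S-shaped transfer function with a time-varying steepness parameter (an assumed linear schedule, not the paper's) can map a crow's continuous position update to descriptor-selection bits:

```python
import numpy as np

def time_varying_transfer(v, t, t_max, tau_min=0.5, tau_max=4.0):
    """S-shaped transfer function with a time-varying steepness control.

    tau shrinks linearly from tau_max to tau_min. A large tau flattens
    the sigmoid toward 0.5 (near-random bits -> exploration); a small
    tau sharpens it toward a step function (bits follow the sign of
    v -> exploitation).
    """
    tau = tau_max - (tau_max - tau_min) * t / t_max
    return 1.0 / (1.0 + np.exp(-v / tau))

def binarize(v, t, t_max, rng):
    """Map continuous crow-position updates to descriptor-selection bits."""
    return (rng.random(v.shape) < time_varying_transfer(v, t, t_max)).astype(int)
```

    Early iterations then flip descriptor bits almost at random, while late iterations keep the bits consistent with the sign of the continuous update.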

    Applying Penalized Binary Logistic Regression with Correlation Based Elastic Net for Variables Selection

    Reducing the dimensionality of high-dimensional classification using penalized logistic regression is one of the challenges in applying binary logistic regression. The applied penalized method, the correlation-based elastic penalty (CBEP), was used to overcome the limitations of the LASSO and the elastic net in variable selection when there are perfect correlations among explanatory variables. The performance of the CBEP was demonstrated through its application to two well-known high-dimensional binary classification data sets. The CBEP provided superior classification performance and variable selection compared with other existing penalized methods, and it is a reliable penalized method for binary logistic regression.
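
    CBEP itself is not reproduced here; as a sketch of the baseline it extends, the following fits an ordinary elastic-net-penalized logistic regression by (sub)gradient descent in plain NumPy (the penalty weights and optimizer settings are illustrative choices):

```python
import numpy as np

def elastic_net_logistic(X, y, lam=0.1, alpha=0.5, lr=0.1, n_iter=2000):
    """Logistic regression with an elastic-net penalty, fitted by
    (sub)gradient descent. alpha mixes the L1 (sparsity) and L2
    (grouping) terms; CBEP replaces the plain L2 term with a
    correlation-weighted one, which is not reproduced here.
    """
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-X @ beta))   # predicted probabilities
        grad = X.T @ (mu - y) / n              # logistic-loss gradient
        grad += lam * ((1 - alpha) * beta + alpha * np.sign(beta))
        beta -= lr * grad
    return beta
```

    With perfectly correlated columns, the L1 term alone would pick one of them arbitrarily; the quadratic term (correlation-weighted in CBEP) is what encourages grouped selection.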

    Restricted ridge estimator in the Inverse Gaussian regression model

    The inverse Gaussian regression (IGR) model is a well-known model in applications where the response variable is positively skewed. Its parameters are usually estimated using the maximum likelihood (ML) method. However, the ML method is very sensitive to multicollinearity, and a ridge estimator was previously proposed for the IGR model. Here, a restricted ridge estimator is proposed. Simulation and real data results demonstrate that the proposed estimator outperforms both the ML estimator and the inverse Gaussian ridge estimator.
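
    As an illustration of the general ridge idea the abstract builds on (not the paper's restricted variant), a ridge-type estimator for a GLM fit takes the usual form below, where W holds the working weights from the ML fit:

```python
import numpy as np

def glm_ridge(X, W, beta_ml, k):
    """Ridge-type estimator for a GLM, following the usual form
    beta_R = (X'WX + kI)^{-1} X'WX beta_ML. k = 0 recovers the ML
    estimate; larger k shrinks the coefficients to stabilise them
    under multicollinearity.
    """
    S = X.T @ W @ X
    p = S.shape[0]
    return np.linalg.solve(S + k * np.eye(p), S @ beta_ml)
```

    The restricted version additionally imposes linear restrictions on the coefficients, which is not shown in this sketch.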

    Non-transformed principal component technique on weekly construction stock market price

    Fast-growing urbanization has made the construction sector one of the major sectors traded in world stock markets. In general, non-stationarity is strongly associated with stock market price patterns. Although stationarity transformation is a common approach, it may cause a loss of the data's original character. Hence, a non-transformation technique using a generalized dynamic principal component (GDPC) was considered for this study. GDPC was compared with two transformed principal component techniques to give a broader perspective on both approaches. The latest two years of weekly observations of nine construction stock market prices from seven different countries were used, and the data were tested for stationarity before the analysis. As a result, the non-transformed technique gave the lowest mean squared error for eight of the series. Similarly, eight construction stock market prices had the highest percentage of explained variance. In conclusion, a non-transformed technique can produce better results without stationarity transformation.
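
    GDPC also uses lagged values of the series, which is not reproduced here; as a simplified stand-in, ordinary PCA on the undifferenced panel illustrates the two evaluation criteria the study reports, per-series reconstruction MSE and explained variance:

```python
import numpy as np

def pca_reconstruction_mse(prices, n_components=1):
    """Project a (T x N) panel of price series onto its leading
    principal components and report per-series reconstruction MSE and
    the explained-variance ratio. The series are centred but not
    differenced, i.e. no stationarity transformation is applied.
    """
    X = prices - prices.mean(axis=0)
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    Xr = U[:, :n_components] * s[:n_components] @ Vt[:n_components]
    mse = ((X - Xr) ** 2).mean(axis=0)            # one MSE per series
    explained = s[:n_components] ** 2 / (s ** 2).sum()
    return mse, explained
```

    When the series share a common trend, a single component already explains most of the variance even though the levels are non-stationary.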

    Variable selection in gamma regression model using chaotic firefly algorithm with application in chemometrics

    Variable selection is a very helpful procedure for improving computational speed and prediction accuracy by identifying the most important variables related to the response variable. Regression modeling has received much attention in several scientific fields. The firefly algorithm is one of the recently proposed nature-inspired algorithms and can be employed efficiently for variable selection. In this work, a chaotic firefly algorithm is proposed to perform variable selection for the gamma regression model. A real data application in chemometrics is conducted to evaluate the performance of the proposed method in terms of prediction accuracy and variable selection criteria, and its performance is compared with other methods. The results demonstrate the efficiency of the proposed method, which outperforms other popular methods.
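
    The abstract does not specify which chaotic map is used; a common choice in chaotic metaheuristics is the logistic map, sketched here together with a single firefly attraction step (all parameter values are illustrative, not the paper's):

```python
import numpy as np

def logistic_map(x0, n, r=4.0):
    """Chaotic sequence on (0, 1) from x_{t+1} = r * x_t * (1 - x_t);
    with r = 4 the iterates are chaotic and can replace uniform draws
    in the firefly update to improve search diversity."""
    seq = np.empty(n)
    x = x0
    for i in range(n):
        x = r * x * (1 - x)
        seq[i] = x
    return seq

def firefly_step(xi, xj, beta0=1.0, gamma=1.0, alpha=0.2, chaos=0.7):
    """One attraction move of firefly i toward a brighter firefly j:
    x_i <- x_i + beta0*exp(-gamma*r^2)*(x_j - x_i) + alpha*(chaos - 0.5),
    with the usual uniform random number replaced by a chaotic value."""
    r2 = np.sum((xi - xj) ** 2)
    return xi + beta0 * np.exp(-gamma * r2) * (xj - xi) + alpha * (chaos - 0.5)
```

    For variable selection, each firefly's position would encode a candidate subset of predictors and brightness would be a model-fit criterion for the gamma regression.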

    Tuning parameter selectors for bridge penalty based on particle swarm optimization method

    The bridge penalty is widely used for selecting and shrinking predictors in regression models, although its effectiveness is sensitive to the choice of its shrinkage and tuning parameters. In this work, the shrinkage and tuning parameters of the bridge penalty are chosen concurrently using a continuous optimization method, particle swarm optimization, which facilitates regression modeling with superior prediction performance. The results show that the proposed method is effective in comparison with other well-known methods, though its performance varies with the simulation setup and the real data application.
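
    A minimal particle swarm optimization sketch, with a toy quadratic standing in for the cross-validated error as a function of the bridge shrinkage and tuning parameters (the swarm settings are generic defaults, not the paper's):

```python
import numpy as np

def pso_minimize(f, bounds, n_particles=20, n_iter=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal particle swarm optimizer over a continuous box.
    Each particle tracks its personal best; the swarm shares a global
    best that pulls all particles toward promising regions."""
    rng = np.random.default_rng(seed)
    lo, hi = np.asarray(bounds, dtype=float).T
    x = rng.uniform(lo, hi, (n_particles, len(lo)))
    v = np.zeros_like(x)
    pbest, pbest_val = x.copy(), np.array([f(p) for p in x])
    g = pbest[pbest_val.argmin()].copy()
    for _ in range(n_iter):
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = np.clip(x + v, lo, hi)
        vals = np.array([f(p) for p in x])
        better = vals < pbest_val
        pbest[better], pbest_val[better] = x[better], vals[better]
        g = pbest[pbest_val.argmin()].copy()
    return g, pbest_val.min()

# toy stand-in for a cross-validated bridge-penalty error surface
best, val = pso_minimize(lambda p: (p[0] - 0.3) ** 2 + (p[1] - 1.5) ** 2,
                         bounds=[(0, 2), (0.5, 3)])
```

    In the actual application, `f` would refit the bridge-penalized regression at each candidate (lambda, gamma) pair and return its validation error.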

    Almost Unbiased Ridge Estimator in the Inverse Gaussian Regression Model

    The inverse Gaussian regression (IGR) model is a very common model when the response variable is positively skewed. The maximum likelihood estimator (MLE) is traditionally used to estimate the IGR model parameters. However, when multicollinearity exists among the explanatory variables, the MLE becomes inefficient as its mean squared error (MSE) becomes inflated. To remedy this problem, the ridge estimator (RE) is used. In this paper, we present an almost unbiased ridge estimator for the IGR model to overcome the multicollinearity problem, and we investigate its performance using a Monte Carlo simulation. The results are compared with those of the MLE and the RE in terms of MSE. In addition, a real dataset is analyzed, and the results show that the suggested estimator performs best when multicollinearity is present among the explanatory variables in the IGR model.
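
    A sketch of the standard almost unbiased ridge form from the literature, adapted to a GLM setting with working weights W; the paper's exact derivation for the IGR model may differ:

```python
import numpy as np

def almost_unbiased_ridge(X, W, beta_ml, k):
    """Almost unbiased ridge-type estimator,
    beta_AUR = (I - k^2 (X'WX + kI)^{-2}) beta_ML,
    which removes most of the ridge estimator's shrinkage bias while
    retaining its variance reduction under multicollinearity."""
    S = X.T @ W @ X
    p = S.shape[0]
    A = np.linalg.inv(S + k * np.eye(p))
    return (np.eye(p) - k ** 2 * A @ A) @ beta_ml
```

    In the eigenbasis of X'WX the plain ridge deviates from the ML estimate by a factor k/(d+k) per component, while the almost unbiased version deviates by only k^2/(d+k)^2, hence the smaller bias.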

    Improving penalized logistic regression model with missing values in high-dimensional data

    Analysis without adequate handling of missing values may lead to inconsistent and biased estimates. Multiple imputation has become a widely used approach for handling missing data, which researchers routinely encounter in their studies. In high-dimensional data, penalized regression is a popular technique for performing feature selection and coefficient estimation simultaneously. However, high-dimensional data often contain large quantities of missing values, for which common multiple imputation approaches may not work correctly. Therefore, this study uses imputation-penalized regression models, an extension of the penalized methods, to improve performance and impute missing values in high-dimensional data. The method was applied to real-life high-dimensional datasets with different numbers of features, sample sizes, and missing-data rates to evaluate its efficiency, and it was compared with other existing imputation penalized methods for high-dimensional data. The comparative experimental results indicate that the proposed method outperforms its competitors, achieving higher sensitivity, specificity, and classification accuracy.
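
    As a simplified stand-in for this kind of pipeline (the paper's method is more elaborate), the sketch below does one-shot column-mean imputation of the missing entries and then fits an L1-penalized logistic regression by subgradient descent:

```python
import numpy as np

def impute_then_penalized_fit(X, y, lam=0.05, lr=0.1, n_iter=2000):
    """Column-mean imputation of NaNs followed by an L1-penalized
    logistic fit. An imputation-penalized method would instead iterate
    between imputing and refitting; this single pass only illustrates
    the idea."""
    X = X.copy()
    col_means = np.nanmean(X, axis=0)
    nan_pos = np.isnan(X)
    X[nan_pos] = np.take(col_means, np.where(nan_pos)[1])
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-X @ beta))
        beta -= lr * (X.T @ (mu - y) / n + lam * np.sign(beta))
    return beta
```

    Even this naive imputation preserves enough signal for the penalty to pick out the truly informative features when the missingness rate is moderate.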